Stochastic Attribute Selection Committees
Authors
Abstract
Classifier committee learning methods generate multiple classifiers to form a committee by repeated application of a single base learning algorithm. The committee members vote to decide the final classification. Two such methods, Bagging and Boosting, have shown great success with decision tree learning. They create different classifiers by modifying the distribution of the training set. This paper studies a different approach: Stochastic Attribute Selection Committee learning of decision trees. It generates classifier committees by stochastically modifying the set of attributes while keeping the distribution of the training set unchanged. An empirical evaluation of a variant of this method, namely Sasc, in a representative collection of natural domains shows that the Sasc method can significantly reduce the error rate of decision tree learning. On average Sasc is more accurate than Bagging and less accurate than Boosting, although a one-tailed sign-test fails to show that these differences are significant at a level of 0.05. In addition, it is found that, like Bagging, Sasc is more stable than Boosting: it less frequently obtains significantly higher error rates than C4.5 and, when error is raised, it produces smaller error rate increases. Moreover, like Bagging, Sasc is amenable to parallel and distributed processing while Boosting is not.
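The core idea in the abstract (vary the attributes each committee member may use, leave the training set untouched, then vote) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the Gini split criterion, the choice to resample the available attributes at every node, the default `keep_prob=0.5`, the restriction to categorical attributes, and the simple majority vote.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def grow_tree(rows, labels, rng, keep_prob):
    """Grow one tree on categorical data with a random attribute subset per node."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1:
        return majority  # pure node: a leaf is just a class label
    n_attrs = len(rows[0])
    # Stochastic step (assumed form): each attribute is available at this
    # node with probability keep_prob.
    available = [a for a in range(n_attrs) if rng.random() < keep_prob]
    if not available:
        available = [rng.randrange(n_attrs)]  # always keep one candidate
    best_attr, best_score = None, None
    for a in available:
        groups = {}
        for row, y in zip(rows, labels):
            groups.setdefault(row[a], []).append(y)
        if len(groups) < 2:
            continue  # attribute is constant here, so it cannot split the node
        score = sum(len(g) / len(labels) * gini(g) for g in groups.values())
        if best_score is None or score < best_score:
            best_attr, best_score = a, score
    if best_attr is None:
        return majority  # no usable attribute was drawn: fall back to a leaf
    children = {}
    # dict.fromkeys keeps first-seen order, so a fixed seed gives a fixed tree.
    for value in dict.fromkeys(row[best_attr] for row in rows):
        idx = [i for i, row in enumerate(rows) if row[best_attr] == value]
        children[value] = grow_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], rng, keep_prob
        )
    return {"attr": best_attr, "children": children, "default": majority}


def predict_tree(node, row):
    """Follow the tree; unseen attribute values fall back to the node majority."""
    while isinstance(node, dict):
        node = node["children"].get(row[node["attr"]], node["default"])
    return node


def train_committee(rows, labels, n_trees=10, keep_prob=0.5, seed=0):
    """Every member is trained on the same, unmodified training set."""
    rng = random.Random(seed)
    return [grow_tree(rows, labels, rng, keep_prob) for _ in range(n_trees)]


def predict_committee(committee, row):
    """Simple majority vote over the committee members."""
    votes = Counter(predict_tree(tree, row) for tree in committee)
    return votes.most_common(1)[0][0]
```

In this sketch, diversity among the trees comes only from the random attribute draws, in contrast to Bagging and Boosting, which resample or reweight the training examples. Because each member is grown independently of the others, the loop in `train_committee` could also be run in parallel. Calling `train_committee(rows, labels)` with a list of attribute tuples and a list of labels returns the committee, and `predict_committee(committee, row)` returns the voted class.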
Similar resources
Integrating boosting and stochastic attribute selection committees for further improving the performance of decision tree learning
Techniques for constructing classifier committees including Boosting and Bagging have demonstrated great success, especially Boosting for decision tree learning. This type of technique generates several classifiers to form a committee by repeated application of a single base learning algorithm. The committee members vote to decide the final classification. Boosting and Bagging create different classi...
Stochastic Attribute Selection Committees with Multiple Boosting: Learning More
Classifier learning is a key technique for KDD. Approaches to learning classifier committees, including Boosting, Bagging, Sasc, and SascB, have demonstrated great success in increasing the prediction accuracy of decision trees. Boosting and Bagging create different classifiers by modifying the distribution of the training set. Sasc adopts a different method. It generates committees by stochastic ma...
Reversible Stochastic Attribute-Value Grammars
An attractive property of attribute-value grammars is their reversibility. Attribute-value grammars are usually coupled with separate statistical components for parse selection and fluency ranking. We propose reversible stochastic attribute-value grammars, in which a single statistical model is employed both for parse selection and fluency ranking.
GRAPH: The Costs of Redundancy in Referring Expressions
We describe a graph-based generation system that participated in the TUNA attribute selection and realisation task of the REG 2008 Challenge. Using a stochastic cost function (with certain properties for free), and trying attributes from cheapest to more expensive, the system achieves overall .76 DICE and .54 MASI scores for attribute selection on the development set. For realisation, it turns ...
A stochastic model for project selection and scheduling problem
Resource limitations at time zero may cause some profitable projects not to be selected in the project selection problem; thus, the simultaneous project portfolio selection and scheduling problem has received significant attention. In this study, budget, investment costs and earnings are considered to be stochastic. The objectives are maximizing net present values of selected projects and minimizing v...